Extending Temporal Data Augmentation for Video Action Recognition

Authors

Abstract

Pixel space augmentation has grown in popularity in many Deep Learning areas, due to its effectiveness, simplicity, and low computational cost. Data augmentation for videos, however, still remains an under-explored research topic, as most works have been treating inputs as stacks of static images rather than temporally linked series of data. Recently, it has been shown that involving the time dimension when designing augmentations can be superior to spatial-only variants for video action recognition [34]. In this paper, we propose several novel enhancements to these techniques to strengthen the relationship between the spatial and temporal domains and achieve a deeper level of perturbations. The results of our techniques outperform their respective variants in Top-1 and Top-5 settings on the UCF-101 [55] and HMDB-51 [38] datasets.
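
The contrast the abstract draws between spatial-only and time-aware augmentation can be illustrated with a toy sketch. This is not the method of the paper or of [34]; it only assumes, for illustration, that a spatial-only crop uses one fixed window for every frame while a time-aware crop lets the window drift linearly across the clip. The names `static_crop` and `moving_crop`, and the nested-list clip layout, are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[int]]  # one grayscale frame: rows of pixel values
Clip = List[Frame]       # a clip: frames in temporal order


def crop(frame: Frame, top: int, left: int, size: int) -> Frame:
    """Cut a size x size window out of a single frame."""
    return [row[left:left + size] for row in frame[top:top + size]]


def static_crop(clip: Clip, top: int, left: int, size: int) -> Clip:
    """Spatial-only variant: the same window is applied to every frame."""
    return [crop(frame, top, left, size) for frame in clip]


def moving_crop(clip: Clip, start: Tuple[int, int],
                end: Tuple[int, int], size: int) -> Clip:
    """Time-aware variant (assumed for illustration): the window's
    (top, left) corner is interpolated linearly from start to end."""
    n = len(clip)
    out = []
    for t, frame in enumerate(clip):
        alpha = t / (n - 1) if n > 1 else 0.0  # 0.0 at first frame, 1.0 at last
        top = round(start[0] + alpha * (end[0] - start[0]))
        left = round(start[1] + alpha * (end[1] - start[1]))
        out.append(crop(frame, top, left, size))
    return out


# Toy clip: 3 frames of 4x4 pixels, each pixel holding its column index.
clip = [[[c for c in range(4)] for _ in range(4)] for _ in range(3)]
static = static_crop(clip, 0, 0, 2)
moving = moving_crop(clip, (0, 0), (0, 2), 2)
```

In the toy clip, the first row of every frame in `static` is identical, whereas in `moving` it shifts one column per frame, so the perturbation itself carries temporal structure.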

Similar resources

Combining Spatio-Temporal Appearance Descriptors and Optical Flow for Human Action Recognition in Video Data

This paper proposes combining spatio-temporal appearance (STA) descriptors with optical flow for human action recognition. The STA descriptors are local histogram-based descriptors of space-time, suitable for building a partial representation of arbitrary spatio-temporal phenomena. Because of the possibility of iterative refinement, they are interesting in the context of online human action rec...

Multi-Task Zero-Shot Action Recognition with Prioritised Data Augmentation

Zero-Shot Learning (ZSL) promises to scale visual recognition by bypassing the conventional model training requirement of annotated examples for every category. This is achieved by establishing a mapping connecting low-level features and a semantic description of the label space, referred to as visual-semantic mapping, on auxiliary data. Reusing the learned mapping to project target videos into an...

Compressed Video Action Recognition

Training robust deep video representations has proven to be much more challenging than learning deep image representations and consequently hampered tasks like video action recognition. This is in part due to the enormous size of raw video streams, the associated amount of computation required, and the high temporal redundancy. The ‘true’ and interesting signal is often drowned in too much irre...

Action recognition in video

Automatic action recognition in video has a broad array of applications, from surveillance to interactive video games. Classic algorithms usually use handcrafted descriptors such as SIFT (see [5]) or HOG (see [3]) to compute feature vectors of videos, and have achieved promising results in the past (see [7]). More recently, Quoc Le and Will Zou at the Stanford AI lab have proved that ISA featur...

Recognition of Visual Events using Spatio-Temporal Information of the Video Signal

Recognition of visual events as a video analysis task has become popular in the machine learning community. While the traditional approaches for detection of video events have been used for a long time, the recently evolved deep learning based methods have revolutionized this area. They have enabled event recognition systems to achieve detection rates which were not reachable by traditional approac...

Journal

Journal title: Lecture Notes in Computer Science

Year: 2023

ISSN: 1611-3349, 0302-9743

DOI: https://doi.org/10.1007/978-3-031-25825-1_8